The multiobjective realisation of the data clustering problem has shown great promise in recent years, yielding clear conceptual advantages over the more conventional, single-objective approach. Evolutionary algorithms have contributed substantially to the development of this increasingly active research area on multiobjective clustering. Nevertheless, the unprecedented volumes of data seen widely today pose significant challenges and highlight the need for more effective and scalable tools for exploratory data analysis. This paper proposes an improved version of the multiobjective clustering with automatic k-determination algorithm. Our new algorithm improves on its predecessor in several respects, but the key changes relate to the use of an efficient, specialised initialisation routine and two alternative reduced-length representations. These design components exploit information from the minimum spanning tree and redefine the problem in terms of the most relevant subset of its edges. Our study reveals that both the new initialisation routine and the new solution representations not only help to reduce the computational overhead, but also entail a significant reduction of the search space, thereby enhancing the convergence capabilities and overall effectiveness of the method. These results suggest that the new algorithm proposed here will offer significant advantages in the realm of ‘big data’ analytics and applications.
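The idea of redefining the problem over a subset of minimum spanning tree edges can be illustrated with a minimal Python sketch. It is not the algorithm proposed in the paper: the function names, the toy data, the parameter `delta`, and in particular the use of edge weight as a stand-in for edge "relevance" are all assumptions made purely for illustration. The sketch builds the minimum spanning tree, keeps only the `delta` longest edges as decision variables, and decodes a binary genotype by removing the selected edges and labelling the resulting connected components.

```python
import math


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph.

    Returns the n - 1 tree edges as (weight, i, j) tuples.
    """
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known cost of attaching each node
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges


def decode(n, edges, candidate_idx, genotype):
    """Remove each candidate edge whose gene is 1, then label components."""
    removed = {candidate_idx[g] for g, bit in enumerate(genotype) if bit}
    root = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while root[x] != x:
            root[x] = root[root[x]]
            x = root[x]
        return x

    for k, (_, i, j) in enumerate(edges):
        if k not in removed:
            root[find(i)] = find(j)
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(n)]


# Toy data (hypothetical): three well-separated groups of points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (20, 0), (20, 1)]
edges = mst_edges(points)

# Assumption: the "most relevant" edges are approximated by the delta longest ones.
delta = 3
candidate_idx = sorted(range(len(edges)), key=lambda k: -edges[k][0])[:delta]

# A genotype now has delta genes instead of one gene per MST edge.
labels = decode(len(points), edges, candidate_idx, [1, 1, 0])
print(labels)
```

In this sketch, every tree edge outside the candidate set is fixed, so the search operates over `2 ** delta` genotypes rather than over all subsets of the tree's edges. This is one way to picture the reduction of the search space described above.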